Feature/revive advanced rag #932
Merged
👍 Looks good to me! Reviewed everything up to c8024ba in 1 minute and 2 seconds
More details
- Looked at 973 lines of code in 26 files
- Skipped 1 file when reviewing.
- Skipped posting 4 drafted comments based on config settings.
1. py/core/main/app_entry.py:58
- Draft comment:
The BASE_URL environment variable is removed from the compose.yaml, but it's still being referenced in the code. Ensure that all references to BASE_URL are removed or handled appropriately to avoid runtime errors.
- Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment is incorrect because the BASE_URL reference has been removed from the code in the diff. No further action regarding BASE_URL is needed in this PR.
Other parts of the codebase not shown in the diff might still reference BASE_URL, but based on the diff provided, the comment does not apply.
The task is to review the diff provided, and within this context the BASE_URL reference has been removed, making the comment unnecessary.
Remove the comment, as it incorrectly states that BASE_URL is still being referenced in the code.
2. py/core/main/app_entry.py:65
- Draft comment:
Same text as draft comment 1.
- Reason this comment was not posted:
Marked as duplicate.
3. py/core/main/app_entry.py:66
- Draft comment:
Same text as draft comment 1.
- Reason this comment was not posted:
Marked as duplicate.
4. py/core/main/app_entry.py:67
- Draft comment:
Same text as draft comment 1.
- Reason this comment was not posted:
Marked as duplicate.
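The concern behind these draft comments (a variable dropped from compose.yaml while code still reads it) is usually avoided by reading the variable with a fallback instead of indexing the environment directly. The following is a minimal sketch only: the helper name get_base_url and its default are hypothetical and do not reflect what py/core/main/app_entry.py actually does.

```python
import os
from typing import Mapping, Optional


def get_base_url(
    env: Optional[Mapping[str, str]] = None,
    default: Optional[str] = None,
) -> Optional[str]:
    """Return BASE_URL if it is set and non-empty, otherwise `default`.

    Hypothetical helper for illustration; `env` can be injected for testing
    and falls back to the real process environment.
    """
    env = os.environ if env is None else env
    value = env.get("BASE_URL")  # .get() never raises KeyError
    if value is None or not value.strip():
        return default
    # Normalize so callers can safely append "/path" segments.
    return value.strip().rstrip("/")
```

With this pattern, removing BASE_URL from the compose file changes behavior to the default rather than raising a KeyError at startup.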
Workflow ID: wflow_uEUGtDWgn42I8qK5
emrgnt-cmplxty added a commit that referenced this pull request on Aug 23, 2024
* Feature/merge graphrag group mgmt (#876) * add group ids to document abstraction, first steps * extend group permissions * up * add tests for new group features * up * fixup auth * onboard extensive regression tests * adding regression tests * finish tests * rm selenium * test observability * uncomment tests * checkin first set of group tests * modify search, passing vector tests * checkin work * full delete logic * update search to use new filters * check in * Clean up * Check in * add search * tests/test_end_to_end.py::test_ingest_txt_document passing * cleanup logging * make schemas explicit * move to run logger abstraction * cleanup some test workflows * revive tests * tweak to pass tests * tweak rrf * finish hybrid search cleanup * fixup on regr tests, regen payloads * refresh payloads * refactor api model * Feature/refactor api model (#868) * cleanup imports * flake and cleanup * coherent global import / export structure * add ingestion response models * add management response models * cleanups * checkin work on routes * remove request models * last fixes * merge * add user / group gating * working test groups * updating client --------- Co-authored-by: NolanTrem <[email protected]> * Clean up API (#878) * Get running * fixes in sdk * Add in more fixes * Feature/merge dev owen changes (#880) * add group ids to document abstraction, first steps * extend group permissions * up * add tests for new group features * up * fixup auth * onboard extensive regression tests * adding regression tests * finish tests * rm selenium * test observability * uncomment tests * checkin first set of group tests * modify search, passing vector tests * checkin work * full delete logic * update search to use new filters * check in * Clean up * Check in * add search * tests/test_end_to_end.py::test_ingest_txt_document passing * cleanup logging * make schemas explicit * move to run logger abstraction * cleanup some test workflows * revive tests * tweak to pass tests * tweak rrf * 
finish hybrid search cleanup * fixup on regr tests, regen payloads * refresh payloads * refactor api model * Feature/refactor api model (#868) * cleanup imports * flake and cleanup * coherent global import / export structure * add ingestion response models * add management response models * cleanups * checkin work on routes * remove request models * last fixes * merge * add user / group gating * working test groups * updating client * rename service to restructure * add get documents for group endpoint * fix client bugs * return delete format * merge cleanups * merge * finalize --------- Co-authored-by: NolanTrem <[email protected]> * Shreyas/graphrag test (#881) * add group ids to document abstraction, first steps * extend group permissions * up * add tests for new group features * up * fixup auth * onboard extensive regression tests * adding regression tests * finish tests * rm selenium * test observability * uncomment tests * checkin first set of group tests * modify search, passing vector tests * checkin work * full delete logic * update search to use new filters * check in * Clean up * Check in * add search * tests/test_end_to_end.py::test_ingest_txt_document passing * cleanup logging * make schemas explicit * move to run logger abstraction * cleanup some test workflows * revive tests * tweak to pass tests * tweak rrf * finish hybrid search cleanup * fixup on regr tests, regen payloads * refresh payloads * refactor api model * Feature/refactor api model (#868) * cleanup imports * flake and cleanup * coherent global import / export structure * add ingestion response models * add management response models * cleanups * checkin work on routes * remove request models * last fixes * merge * add user / group gating * sync * enrich * up * fix global search * rag * remove client.py * rm configs * rm configs --------- Co-authored-by: emrgnt-cmplxty <[email protected]> Co-authored-by: NolanTrem <[email protected]> Co-authored-by: emrgnt-cmplxty <[email protected]> * 
Feature/fix embedding pipe (#882) * up * fixup concurrency * fix ollama embeddings * fix batching with ollama * checkin all cleanups * rm kg cruft (#884) * rm kg cruft * tweaks * tweak 2 (#885) * Feature/fix retrieval endpoint cruft (#887) * tweak 2 * fix retrieval endpoint descriptions * Python SDK (#886) Clean up Python SDK and routes * Separate out SDK, add js and go sdk to monorepo (#888) * Add r2r-js sdk * Add go sdk * Pull out python sdk * remove venv * Update packages * Check in fixes * Remove alembic dependencies * Feature/merge w nolan (#894) * cleanup hybrid search * cleanups in * Fix structure * Make graspologic optional * fix rag stream (#895) * add py r2r (#896) * Clean up (#897) * fix agent (#898) * define `RAGAgentResponse` (#899) * Shreyas/unstructured (#900) * api + oss lib * rm pdb * rm poetry lock * update version * fixes * Feature/cleanup client obj logic (#901) * define `RAGAgentResponse` * cleanup client logic * Shreyas/tests (#889) * init * tests * rename service * api model * add * merge * rm restructure router * print descriptions * Refactor CLI (#903) * Rm files readded by git (#904) * Remove Execution Wrapper (#905) * Rm files readded by git * Fix merge botch * Feature/fix auth revive tests rebased (#906) * adding the client touch ups * fix auth, revive tests * add back tests * uncomment run auth workflow * decruft * refresh test kg * fixup toml (#908) * Feature/fix ingestion update (#909) * fixup toml * fix update * Fix CLI Tests (#912) Fix CLI tests * Shreyas/kg runtime cfg (#913) add kg runtime config * rename kgenrichmentresponse (#914) * Feature/add nltk hybrid expansion rebased (#917) * expand hybrid search with nltk * cleanups * cleanup hybrid search * format * add setup.py * update * add script (#918) * Fix bug in document chunks (#921) * Fix bug in update files (#923) * Shreyas/unstructured (#922) * fix dockerfiles * adding config * fix paths * mv unstructured dep to docker * clean * Update docker_utils.py * Update 
unstructured_parsing.py * Update r2r_chunking.py * Update app_entry.py * Feature/repair logging (#925) * fixing logs * fix * rm double logging (#929) * Configs (#926) * Fix config logic * Update config * Clean up cli entry point * Disable SSL when installing nltk wordnet (#930) * Fix analytics endpoint * Update OpenAI sdk calls (#933) * Feature/revive advanced rag (#932) * rm double logging * revive advanced rag examples * merge (#934) * sync model (#935) * Feature/remove version from ingestion end pt (#936) * sync model * remove ability to set version * tweak versions impl * fix version bug * Move docker (#938) * Move docker * remove from root * Clean up sdk/restructure.py * Fix js tests, completion scoring (#939) * Shreyas/unstructured docker image (#940) unstructured docker image * Update JS (#941) * Update models (#942) * Feature/complete group logic (#945) * fix group logic * up * Fix Dockerbuild, Symlink Readme (#944) * Add back tast prompt override and include title if availible * Fix docker, sym link readme * Fix compose file path * Shreyas/KG Search Result model (#937) * return type to kg_search_result * add model * local and global results * modify config * refresh should not be gated by auth (#946) * Linting sync (#947) * Remove email from refresh (#948) * Fix link to image * Feature/rm print cruft rebase (#953) * refresh should not be gated by auth * rm print cruft * black and sort * merge * rm * update api return type * Update Actions (#954) * Update Github Actions (#956) * Update Actions * Update actions * Shreyas/kgsearchresult model (#957) * return type to kg_search_result * add model * local and global results * modify config * add models * up * fix config path * fix models * Login and refresh token bug (#959) * Update Actions * Fix bug in login with refresh token * Point pytest to linux (#960) * collection docs (#955) * Feature/merge dev to main (#962) * merge dev and main * git rm * add back collection fix * fix docker builds (#963) * Running 
unstructured docker + code cleanups (#964) * Small bugfixes on prompts, return types (#965) * Fix failing CLI tests * NPM publish action * remove tarball * Feature/fix dev tests (#966) * update auth tests * fix tests * back and sort * decruft * revert back to gpt-4o --------- Co-authored-by: NolanTrem <[email protected]> Co-authored-by: Shreyas Pimpalgaonkar <[email protected]>
Summary:
Enhanced R2R system with improved configuration management, CLI commands, and advanced RAG functionality, including new scripts and refined search and ingestion pipelines.
Key points:
- BASE_URL: removed from the compose.yaml environment variables.
- py/cli/command_group.py: config-path, config-name, and base-url options removed.
- serve command in py/cli/commands/server.py: handles config-name and config-path.
- run_local_serve and run_docker_serve in py/cli/utils/docker_utils.py: support the new configuration handling.
- SerperClient: […] to py/core/__init__.py for search integrations.
- user_id: optional in VectorSearchResult in py/core/base/abstractions/search.py.
- py/core/examples/scripts/: […] run_hyde.py and added serve_with_hyde.py.
- […] run_web_rag.py to serve_with_web.py and run_web_multi_rag.py to serve_with_web_hyde.py.
- R2RBuilder in py/core/main/assembly/builder.py: handles config_path.
- ManagementService in py/core/main/services/management_service.py: improved log and analytics handling.
- IngestionPipeline in py/core/pipelines/ingestion_pipeline.py: streamlined document processing.
- WebSearchPipe in py/core/pipes/other/web_search_pipe.py: better search result handling.
- py/scripts/download_nltk_data.py: […] for secure downloads.
Generated with ❤️ by ellipsis.dev
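The user_id change in the key points above can be pictured with a small standard-library sketch. Everything here except the field name user_id is an assumption made for illustration: the class name SearchResultSketch, the other fields, and the as_dict helper are hypothetical, and the real VectorSearchResult in py/core/base/abstractions/search.py may be defined quite differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional
from uuid import UUID, uuid4


@dataclass
class SearchResultSketch:
    """Hypothetical stand-in for a search result with an optional owner."""

    text: str
    score: float
    # Optional: results that are not tied to a user simply omit it.
    user_id: Optional[UUID] = None
    metadata: Dict[str, Any] = field(default_factory=dict)

    def as_dict(self) -> Dict[str, Any]:
        """Serialize, converting the UUID to a string only when present."""
        return {
            "text": self.text,
            "score": self.score,
            "user_id": str(self.user_id) if self.user_id is not None else None,
            "metadata": self.metadata,
        }


# Usage: both forms construct without error.
anonymous = SearchResultSketch(text="chunk", score=0.42)
owned = SearchResultSketch(text="chunk", score=0.42, user_id=uuid4())
```

Making the field optional means callers that never had a user context no longer need to invent a placeholder ID, at the cost of every consumer having to handle the None case.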